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(54) System for user control of version synchronization in mobile computing 

(57) A universal system is provided for synchroniz- 
ing servers which accommodates wide area mobile 
computing while at the same time making the process 
more efficient. The system includes a network of pri- 
mary servers with high performance reliable links mak- 
ing up the backbone of the synchronization process to 
which secondary servers are linked via typically less 
reliable links. Moreover, synchronization from a mobile 
computer can be done whether in client/ server mode, 
or peer-to-peer to support any topology of secondary 
servers. In one embodiment while the primary servers 
are automatically and frequently synchronized, syn- 
chronization of the secondary servers is under the con- 
trol of the user which prevents unintended 
synchronization. A summarizing version vector is used 
to minimize the amount of data transmitted by avoiding 
the necessity for exchanging version vectors for individ- 
ual objects. This summarizing version vector also per- 
mits differential synchronization using summarizing 
version vectors and update stamps, the generation of a 
latest common version vector to purge off differential 
updates on a server, restart of synchronization from the 
point of previous failure with data from an unaffected 
server, and fine grain synchronization by permitting a 
differential update as the atom of data to be transmitted. 
Additionally, the system automatically switches between 
whole object synchronization and differential synchroni- 
zation. Further, the subject system permits synchroni- 
zation between different systems because the 
semantics of the data is segregated from the synchroni- 
zation due to extracting updates in a standard format 
and synchronizing based on a standard protocol. 
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Description 

BACKGROUND OF THE INVENTION 

[0001] It will be appreciated that with the proliferation 5 
of mobile computing in the form of laptops and the gen- 
eration of versions of data at various points, there is a 
need to be able to synchronize the versions of the data 
created at multiple locations. This version synchroniza- 
tion must take place over a variety of different platforms io 
as individuals move from location to location, state to 
state, and country to country. 

[0002] When multiple users work on the same version 
of a document or data, in the past there have been var- 
ious ways of reconciling the documents. Many systems is 
have been provided to alert one user to the fact that 
another document has been altered and to either 
update with the latest version or reject the latest version 
at the user's discretion. Log files have been used in the 
past to ascertain the time that a version was generated 20 
and version vectors have been utilized for some time to 
detect conflicts between object replicas. 
[0003] As will be appreciated, there have been several 
approaches to the detection of resolving update con- 
flicts such as described by' James J. KISTLER et al., 25 
Disconnected Operation in the CODA File System, 
ACM Transactions on Computer Systems, 10(1), 1992; 
Peter Reiher et al., Resolving File Conflicts in the Ficus 
File System, in USENIX Conference Proceedings, USE- 
NIX, June 1994; and, Douglas B. Terry et al., Managing 30 
Update Conflicts in Bayou, a Weekly Connected Repli- 
cated Storage System, in Proceedings of the Fifth Sym- 
posium on Operating System Principles, ASM, 
December 1995. The CODA and Ficus systems utilize 
version vectors to detect conflicts between object repli- 35 
cas as first proposed by D. Stott Parker et al., Detection 
of Mutual Inconsistency in Distributed Systems, IEEE 
Transactions on Software Engineering 9(3), May 1993, 
where the Bayou system does not detect conflicts itself, 
but rather relies on the application. 
[0004] Conflict detection utilizing version vectors has 
evolved into two basic paradigms. One paradigm, as 
exemplified by Ficus, involves utilizing a version vector 
to characterize an individual object replica's current 
state relative to other replicas. Another paradigm which 
can be inferred by the CODA system is applying version 
vectors to both object replicas and the replication unit 
that contains the set of objects. 
[0005] With respect to the CODA system it will be 
appreciated that it is a file replication system which does 
not support peer-to-peer synchronization. It is in 
essence a client/server system which will not allow two 
clients to synchronize directly with each other. For 
example, if two mobile users want to exchange each 
other's data this must be a peer-to-peer synchronization 
if they didn't know each other before. The only way two 
such clients can possibly synchronize their data is via a 
common server. If the two clients become discon- 



nected, then no synchronization can occur in real time. 
The CODA system is also not well suited to mobile com- 
puting because of the way it caches data and synchro- 
nizes with the CODA servers. In CODA when a server is 
opened, the client always tries to get the latest version 
of the file among all servers available. When the file is 
closed the client always tries to synchronize with all 
servers available. This is extremely expensive due to 
the heavy data transmission. Moreover, CODA cannot 
support disconnected operation if a file access was not 
cached on the client prior to the disconnection. This is 
because in CODA the data is not replicated as a whole 
but rather the files are cached one by one. Thus the 
CODA system is complicated when trying to apply it to 
mobile computing as it requires parallel synchronization 
with multiple servers with every open and close of a file. 
[0006] With respect to the Ficus system, it is a file rep- 
lication system which does not support client/server 
synchronization. It is essentially a peer-to-peer system 
and as such all synchronization between replication 
sites is automated. This is not economical in the mobile 
computing environment where it may not be desirable 
because of the expensive and unreliable nature of the 
communications links. The Ficus system may also vio- 
late the semantics of an application due to the auto- 
matic synchronization, making this system not 
applicable in such cases. Since synchronization in 
Ficus is realized by using version vectors maintained by 
individual files, the version vectors need to be 
exchanged for each file. This is considered not efficient 
for wireless communication. For example, all the version 
vectors of files must be transmitted between servers, 
although the files on both servers may be the same. 
Also, the Ficus system does not support differential syn- 
chronization because it sends whole files during the 
synchronization process in an all-or-none manner. 
[0007] Finally, the Bayou system is a data base repli- 
cation system which supports differential synchroniza- 
tion by utilizing a log of writes applied to the database. 
However, the Bayou system imposes using a single pri- 
mary server to globally finalize the order and commit- 
ment of writes. A single primary server, if down, delays 
the final commitment of the writes. Thus if the primary 
server is down, the final order of the writes cannot be 
established. This delays the ability of other servers to 
see the stabilized database as soon as possible. More- 
over, the Bayou system requires applications to add 
conflict detection and resolution code to every write they 
generate. This requirement leads to problems. First, the 
conflict detection is not done by the system, but by the 
application, which can be a burden to some of the appli- 
cations. Secondly, this code for conflict detection and 
resolution must be propagated along with each of the 
writes which can greatly increase the cost and traffic of 
data synchronization. 

[0008] Moreover, all of the above systems are not flex- 
ible enough to deal with different data types across dif- 
ferent systems. 
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[0009] It is therefore necessary to provide a system 
which can cope with all of the problems described 
above, especially for the mobile computing environ- 
ment. 

5 

SUMMARY OF T HE INVENTION 

[0010] In order to solve the above problems, espe- 
cially in a mobile computing environment in which links 
to either a central server or peer-to-peer system are w 
oftentimes unreliable, a system is provided for reliable 
synchronization of versions of an object stored at differ- 
ent servers which involves the replacement of either the 
single central server or a peer-to-peer server system 
with a network of primary servers linked with high per- is 
formance reliable links which serve to synchronize sec- 
ondary servers. The use of the primary servers permits 
the data to be distributed in a wider area with better data 
consistency and more reliability because of the use of 
multiple primary servers. By making the mobile compu- 20 
ter a secondary server, this distinguishes these less reli- 
able servers from those servers which are the backbone 
of the synchronization process. 
[0011] In one embodiment, the primary servers are 
automatically and frequently synchronized in order to 25 
maintain good data consistency among the primary 
servers. On the other hand, the less reliable synchroni- 
zation from the secondary servers is controlled from the 
secondary servers so that communication costs can be 
reduced and so that the document or other data will not 30 
be synchronized unless authorized by the user at the 
secondary server. This permits the user to work with 
drafts of documents or uncompleted software or pro- 
gramming which is not to be synchronized before it is 
finished. 35 
[0012] As an important feature, in the subject system 
a summarizing version vector is used to minimize the 
amount of data transmitted in the synchronizing process 
by avoiding the necessity for exchanging version vec- 
tors for individual objects, whether or not there is any 40 
difference in the two objects being synchronized. 
[0013] The term summarizing version vector as used 
herein means a vector having fields which summarize 
the state of the object container at a server. Each sum- 
marizing version vector is a vector of update stamps. 45 
Each update stamp has a field for the associated object 
container's identifier and a field for the associated time 
stamp. 

[0014] In the case that the objects are not the same, 
in the prior art, version vectors for each object at a first so 
location were sent to a second location, with a corre- 
sponding update being sent back to the first location. 
This created an a situation in which needless data was 
transmitted. 

[0015] In the subject system a single summarizing 55 
version vector is sent from a first location to a second 
location, and all the updates are then returned all at 
once to the first location. In one embodiment, the sum- 



marizing version vector is examined at the second loca- 
tion to determine if a whole object or a portion of the 
object needs to be sent to the first location. This is 
accomplished by either defining a version vector to a 
whole object or defining a version vector to the base of 
an object and the update stamp for each of its differen- 
tial updates. 

[0016] As used herein the base of an object refers to 
an object in its initial form absent any changes reflected 
by differential updates. 

[0017] Next the received summarizing version vector 
is compared with either the above version vectors of the 
individual objects or the update stamps for individual dif- 
ferential updates. As noted above, the update stamp of 
a differential update consists of the identifier of the 
object container which generated it and the time stamp 
when it was generated. 

[0018] All of the whole objects or differential updates 
are only sent to the first location if their version vectors 
or updated time stamps are either newer than the 
received summarizing version vector or conflicts with it, 
thus eliminating the unnecessary version vector transfer 
associated with prior art systems. Thus the summariz- 
ing version vector system eliminates repeated transfer 
of version vectors associated with individual objects. 
[0019] Another use for the summarizing version vector 
is as follows. It will be noted that in the past, synchroni- 
zation systems have operated on an all or nothing basis, 
meaning that if the synchronization fails before comple- 
tion, then all of the data must be present. In the subject 
In the subject system, when an update is sent from a 
first server to a second server, both the version vector of 
the corresponding object and the summarizing version 
vector in the second sever will be updated right after it 
successfully receives the update. Therefore, when syn- 
chronization falls, the synchronization will be restored 
without resending the updates which were already 
received by the second server in the previous synchro- 
nization by comparing the first server's summarizing 
version vector with the second server's updated version 
vector. This offers more fine grain synchronization and 
is therefore more fault tolerant. This is because in the 
subject system the unit of transmitted data may be a dif- 
ferential update, called an atom because of its small 
size. This is distinguished from the prior art systems 
which must transmit the whole object as the unit of 
transmitted data. If more than two servers are involved, 
the synchronization may be restored between the first 
server and the second server after the point of failure if 
in the meantime a third server has synchronized with 
the second server between the time of the failure and 
the time of the restart. The above is made possible by 
the use of the summarizing version vector. 
[0020] It will be appreciated that the object in a con- 
tainer may be any object, for instance a document, a 
program, or a row of a table in a relational database, 
making the subject system a universal system. This 
integrates the synchronization process for various forms 
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of data and is made possible by the separation of the 
semantics of objects from the synchronization. This is in 
contrast to existing systems which are dependent on 
the form of the data. 

[0021] As one important aspect of the subject inven- 5 
tion, the subject system permits synchronization 
between different systems such as synchronizing 
between a Word document, a McDraw document and a 
FrameMaker document because the semantics of the 
data may be separated from the synchronizing. This is 10 
accomplished by extracting updates in a standard for- 
mat and synchronizing based on the standard protocol. 
[0022] In operation, an update at one location in one 
format is converted to an update in the standard proto- 
col. The update in the standard protocol is transmitted is 
to a second location where it is converted to the format 
of the object at the second location. 
[0023] Moreover, the subject system uses differential 
synchronization to minimize the data transfer to only 
that data which is changed between versions of objects. 20 
As a special feature of the subject invention, the system 
switches from whole object synchronization to differen- 
tial synchronization when, for instance, local memory is 
full or when disc space is limited. Additionally when an 
object at one location is very old, if differential synchro- 25 
nization is being used, it is convenient to update the 
entire or whole object at this second location by auto- 
matically shifting to whole object synchronization. 
[0024] It is a further feature of the subject invention 
that synchronization from a mobile computer can be 30 
done either In the client/server mode from secondary 
server to primary server, or can be done amongst the 
secondary servers, either peer-to-peer or in a hierarchi- 
cal system, thus permitting mobile-to-mobile synchroni- 
zation. For example, if one secondary server is a user's 35 
desktop computer and another secondary server is a 
mobile device for secondary use, then data synchroni- 
zation between the two secondary servers is of a client- 
server type, while the data synchronization of two 
mobile devices for instance owned by two salesmen for 
exchanging each other's sales data is of the peer-to- 
peer type. As a result of the fact that the subject system 
supports both peer-to-peer and hierarchical structures, 
the subject system will support any topology of the sec- 
ondary servers. 

[0025] Because oftentimes a server is only concerned 
with data from a selected number of servers, it is unnec- 
essary to synchronize with all of the servers in the sys- 
tem. In a system which has a large number of servers, 
only some of which have data which one wishes to syn- 
chronize, if one were to attempt to keep track of all 
objects and all updates, memory would be quickly 
exhausted. 

[0026] In order to solve this problem in one embodi- 
ment of the subject invention, a latest common ancestor 
version vector is utilized to selectively purge updates 
and version changes at a selected group of servers 
which are older than or equal to this latest common 



ancestor version vector by purging off differential 
updates or deleted objects which have propagated to 
the group of the servers in question, e.g. the selected 
servers. 

[0027] Ideally, a server should purge off differential 
updates or deleted objects only if all of the servers in a 
system have received the updates. However, because 
the number of servers, especially the secondary serv- 
ers, dynamically changes, and some servers may be 
disconnected from the whole system for a very long 
time, purging off updates only after all the servers 
received them is impractical and the memory or disk 
space may be exhausted even before it can actually 
purge off updates. 

[0028] Therefore, in the subject system, the system 
allows a server to select a set of servers and purge off 
updates if the updates have already propagated to all of 
the selected servers. 

[0029] Furthermore, a selected server may also be 
excluded in the consideration of purging off of updates if 
it is not synchronized with the selecting server for a time 
period in excess of a pre-set limit. 
[0030] In the subject system, this purging off of 
updates is done by utilizing the latest common ancestor 
version vector. When a server synchronizes with any of 
the selected servers, it will record the selected servers 
versions vector locally. The latest common ancestor ver- 
sion vector is then calculated from its own version vec- 
tor and all of the selected server's version vector. Any 
differential updates or deleted objects' information will 
be purged off if their update stamps or version vectors 
are newer than or equal to the calculated latest common 
version vector. 

the synchronization algorithm as shown below uses 
both summarizing version vectors and update stamps to 
minimize the data transmitted. During synchronization, 
the subject algorithm goes through the below listed 
steps in order to realize the propagation of updates from 
one server to another: 



40 

1) The first server sends its summarizing version 
vector to the second server. 

2) Upon receiving the summarizing version vector 
45 of the first server, the second server sends to the 

first server its summarizing version vector, followed 
by all of the identifiers of objects which exist in the 
second server and that can support differential syn- 
chronization. 

50 

3) Upon receiving the summarizing version vector 
and identifiers from the second server, the first 
server figures out all of identifiers of objects which 
need to be received as whole objects and sent to 

55 the second server. The sub-steps for this step are 
as follows: 

a) Find out all objects which exist in the second 
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server but do not exist in the first server and put 
their identifiers into the list of objects needed to 
be received from the second server. This is 
done by checking if any of the received identifi- 
ers of objects matches the identifier of any 5 
object existing in the first server. 

b) Find out all objects which exist in both serv- 
ers that can support differential synchroniza- 
tion . Then check if any newer than the common to 
version vector of the two servers, where base 
version vector of an object refers to the version 
vector of the object absent any differential 
updates and where the common version vector 
refers to a version vector reflecting the state is 
from where the two servers' summarizing ver- 
sion vectors diverged. All of the identifiers of 

the objects which passed the check are also 
added to the list of objects needed to be 
received from the second server. 20 

c) Send all the identifiers obtained from sub- 
steps a) and b) to the second server. 

4) Upon receiving the identifiers from its first server, 2s 
the second server compares its latest common 
ancestor version vector with the summarizing ver- 
sion vector of the first server. If the summarizing 
version vector of the first server is older than the 
second server's latest common ancestor version 30 
vector, it extracts all the individual whole objects it 
contains. Otherwise, it proceeds to compare its 
summarizing version vector with the first server's 
summarizing version vector. If the second server's 
summarizing version vector is newer than the first 35 
server's summarizing version vector, the second 
server will scan all the objects existing in it. If the 
identifier of an object appears in the list of identifiers 
received from the first server, it is sent to the first 
server as a whole. If an object is the one needed to <to 
be sent as a whole because it does not support dif- 
ferential synchronization and its version vector is 
either newer or conflicting with the first server's 
summarizing version vector, this object will also be 
sent to the first server. If an object is the one which 45 
supports differential synchronization and its base 
version vector is newer than the first server's sum- 
marizing version vector, this object will be sent to 
the first server as a whole too. Finally, if an object is 
the object which supports differential synchroniza- so 
tion and has updates which stamps are either 
newer than or conflicting with the first server's sum- 
marizing version vector, these updates will be sent 
to the first server as the differential updates. The 
order of all of the objects or updates being sent is 55 
determined first by the ordering of version vectors 
or the associated update stamps. If there is a dis- 
crepancy in the ordering, then a secondary order- 



ing process is invoked. 

5) After finishing step 4, if the first server is one of 
the second server's selected servers, the second 
server will store or update the first server's summa- 
rizing version vector it has stored previously. It may 
also recalculate its latest common ancestor version 
vector if all of the selected servers' summarizing 
version vectors have been stored in the second 
server, and accordingly purges off all the deleted 
object's information or differential updates whose 
version vectors or update stamps are older than or 
equal to the latest common ancestor version vector. 

6) Upon receiving each object and update from the 
second server, the first server updates the informa- 
tion in it as follows: 

a) If the received object or update has a version 
vector or time stamp older than or equal to the 
version vector of the corresponding object in 
the first server, this object or update will be 
thrown away. 

b) If a whole object was received and its ver- 
sion vector is newer than the version vector of 
the corresponding object in the first server, the 
object in the first server will be replaced with 
the received object, and accordingly the 
object's version vector and the first server's 
summarizing version vector will also be 
updated. 

c) If a whole object was received and its version 
vector is conflicting with the version vector of 
the corresponding object in the first server, the 
two versions of the object will be reconciled by 
calling a reconcile method on the object in the 
first server, with the received object as the 
argument. As the result, a new object will be 
generated to replace the object in the first 
server, and accordingly, a new version vector 
for the new object is generated and the sum- 
marizing version vector of the first server is 
updated. 

d) If a differential update was received and its 
update stamp and the received second server's 
summarizing version vector indicate it is not yet 
contained in and was generated after the corre- 
sponding object's differential updates in the 
first server, the received updates will be added 
to the top of the object's differential updates in 
the first server. Accordingly, the summarizing 
version vector of the first server will be 
updated. 

e) If a differential update was received and its 



5 



9 



EP 0 974 895 A2 



10 



update stamp and the received second server's 
summarizing version vector indicate it is not yet 
contained in and was not generated after the 
corresponding object's differential updates in 
the first server, the differential update(s) in the 5 
first server that were generated after the com- 
mon state of the two servers will be extracted, 
and both the extracted differential updates and 
the received differential update will be passed 
as arguments to the reconcile method on the w 
object in the first server in order to reconcile the 
conflict between them. As a result, the object in 
the first server will be replaced with a new 
object or a new update will be added to the 
object's differential updates in the first server, 15 
and accordingly a new version vector for the 
new object or the update stamp for trie new 
update will be generated and the summarizing 
version vector of the first server will be 
updated. 20 

7) After finishing step 6, if the second server is one 
of the first server's selected servers, the first server 
will store or update the second server's summariz- 
ing version vector it has stored previously. It may 25 
also recalculate its latest common ancestor version 
vector if all of the selected servers' summarizing 
version vectors have been stored in the first server, 
and accordingly purges off all the deleted object's 
information or differential updates whose version 30 
vectors or update stamps are older than or equal to 
the latest common ancestor version vector. 

8) After finishing step 7, the update propagation 
from the second server to the first server is com- 35 
pleted. If a two way update propagation is required 
then steps 1 ) to 7) can be repeated by replacing the 
first server with the second server and vice versa. 



by secondary servers, economical synchronization can 
be achieved. Further, since some applications do not 
permit automatic synchronization, user control of syn- 
chronization which prevents unintended synchroniza- 
tion is critical. 

[0033] A second aspect of the subject invention is the 
use of a summarizing version vector. This summarizing 
version vector is used to minimize the amount of data 
transmitted in the synchronization process by avoiding 
the necessity for exchanging version vectors for individ- 
ual objects. It can also further reduce data to be trans- 
mitted by working with update stamps associated with 
individual differential updates to permit differential syn- 
chronization. A latest common version vector can be 
constructed from a server's summarizing version vector 
and the summarizing version vectors of a set of its 
selected servers, which is used to reduce differential 
updates on the server. Additionally, the summarizing 
version vector permits the re-start of synchronization 
from the point of previous failure with data from an unaf- 
fected server. It also supports fine grain synchronization 
by permitting a differential update as the atom of data to 
be transmitted. 

[0034] As a significant feature of the subject invention, 
the system will switch between whole object synchroni- 
zation and differential synchronization automatically. 
For instance in memory management it is oftentimes 
desirable to switch from differential synchronization to 
whole object synchronization due to the memory inten- 
sive nature of differential processing. Also when doing 
whole object synchronization it may be beneficial to 
switch to differential synchronization to minimize the 
data to be transmitted. 

[0035] Last but not least, the subject system permits 
a synchronization between different systems because 
the semantics of the data is separated from the syn- 
chronization, making the system universal as to the for- 
mat of the data. 



[0031] As can be seen from the description above, the 
subject algorithm has two advantages compared with 
the existing algorithms. First, it can avoid transmitting 
per-object version vectors from one server to another. 
Secondly it can transmit the differential updates which 
are incorporated in one server, but not incorporated in 
another server. An additional advantage of the subject 
algorithm is that it offers finer grained synchronization, 
and is more fault-tolerant. 

[0032] In summary, a universal system is provided for 
synchronizing servers which accommodates wide area 
mobile computing while at the same time making the 
synchronizing process more efficient. First, the intro- 
duction of multiple primary servers with high perform- 
ance reliable links ensures high data availability and 
scaleability. Secondly, the introduction of the secondary 
servers makes the subject system well suited for mobile 
computing which may use unreliable and expensive 
links. Moreover, with the control of the synchronization 



40 BRIEF DESCRIPTION OF THE DRAWINGS 

[0036] These and other features of the subject inven- 
tion will be better understood with reference to the 
Detailed Description taken in conjunction with the Draw- 
45 ings, of which: 

Figure 1 is a diagrammatic representation of the 
system architecture of the subject inven- 
tion illustrating the primary server/second- 
ary server structure; 
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Figure 2 



is a diagrammatic representation of the 
architecture of the server for the system of 
Figure 1 showing an object container 
manager, a synchronization manager, 
synchronizers and object containers; 



Figure 3 is a diagrammatic representation of the 
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major subroutines for each of the compo- 
nents shown in Figure 2; 

Figure 4 is a block diagram of the structure of the 
object container of Figure 2, showing the 
summarizing version vector used therein, 
the latest common ancestor version vec- 
tor and the structure of the differential 
based log used in the subject system; 

Figure 5 is a diagrammatic representation of the 
synchronization of two servers illustrating 
the utilization of summarizing version vec- 
tors; 

Figure 6 is a block diagram illustrating a process for 
modifying objects in an object container; 

Figure 7 is a diagrammatic representation of a syn- 
chronization protocol showing the 
exchange of data between first and sec- 
ond servers; 

Figure 8 is a block diagram of the system for 
extracting updates to be transmitted from 
a first server to a second server utilizing 
summarizing version vectors and a differ- 
ential update log for the second server; 

Figure 9 is a block diagram showing a process for 
detecting and resolving conflicts between 
two servers; 

Figure 10 is a block diagram showing a process for 
the purging off of update logs at a server; 
and, 

Figure 11 is a block diagram illustrating the switch- 
ing of the synchronization system from a 
whole object synchronization process to a 
differential synchronization process based 
on predetermined characteristics of the 
system. 

DETAILED DESCRIPTION 

[0037] Referring now to Figure 1 , the system architec- 
ture involving primary and secondary servers is illus- 
trated. Here a network 10 of primary servers 12 is 
interconnected through the utilization of high perform- 
ance links 14 which are generally orders of magnitude 
more reliable than wireless links. It is the purpose of the 
high performance links to ensure synchronization integ- 
rity and to permit connection of secondary servers to 
the primary servers. 

[0038] As illustrated, a number of secondary servers 
16 are either linked to each other as illustrated at 1 8 in 
a peer-to-peer relationship; or are linked as illustrated at 



20 in a hierarchical manner. As discussed hereinbefore, 
it is this primary server/secondary server architecture 
which permits reliable synchronization with virtually any 
type of system, be it a peer-to-peer system or a hierar- 

5 chical system. It will be appreciated that because of the 
primary server/secondary server architecture, any type 
of system can be accommodated, thus the subject sys- 
tem can accommodate any topology amongst the sec- 
ondary servers. 

10 [0039] Referring now to Figure 2, in one embodiment, 
the architecture of the server of the subject system 
includes a number of object containers 22 and an object 
container manager 24 coupled thereto. The object con- 
tainer manager is coupled to a synchronizer manager 

15 26 which is in turn coupled to object containers 22 and 
synchronizers 28. A protocol utility 30 is driven by syn- 
chronizer manager to select the most reliable connec- 
tion to the network. 

[0040] In operation, a system utility or application ini- 

20 tiates synchronization from either the object container 
or the synchronizer manager. Synchronizer manager 26 
consults with utility 30 to open a reliable connection 
between two servers to be synchronized. Thereafter, 
synchronizer manger 26 creates a synchronizer such as 

25 synchronizer 28 based on the result from the protocol 
utility. Then the synchronizers on the two servers will ini- 
tiate the synchronization process. 
[0041] Referring now to Figure 3, the subroutines or 
methods utilized in the system of Figure 2 are now 

30 described. It will be appreciated that object container 22 
utilizes a put method 32 a delete method 34, a get 
method 36, a synchronize method 38, an applyUpdate 
method 40, a generateUpdates method 42, a purgeOff- 
History method 44 and a getSummarizingVersion Vector 

35 46. These methods are written in Java and can be found 
in the source code hereinafter. 
[0042] As can be seen, object container manager 24 
is provided with an openObjectContainer method 50, 
whereas synchronizer 28 is provided with start method 

40 52, pull method 54, push method 56 and stop method 
58, all written in Java and can be found in the source 
code hereinafter. 

[0043] Synchronizer manager 26 is provided with a 
synchronizermethod 60, whereas protocol utility 30 is 

45 provided with an isConnected method 62 and a get- 
BestConnection method 64 which are standard. 
[0044] Referring now to Figure 4, the data structure of 
the subject system is shown for use with object con- 
tainer 22. Of importance is the structure of the summa- 

50 rizing version vector, here illustrated at 70 which is used 
throughout the subject invention. As can be seen, the 
summarizing version vector of object container 22 has a 
field for identification 72 and a field for a time stamp 74. 
It will be appreciated that Sid, refers to object container 

55 1 in the system, whereas the number 1 544 refers to a 
time stamp associated with object container 1. The 
summarizing version vector in essence contains the 
identifiers and the time stamps for all of the objects in 
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the respective containers. As shown at 80, a similar 
summarizing version vector for a remote object con- 
tainer is shown at 76. It will thus be appreciated that 
there are summarizing version vectors at a number of 
locations throughout the system. 5 
[0045] As illustrated at 80 there is a specialized ver- 
sion vector called the latest common ancestor version 
vector which serves to identify that differential updates 
which can be read off of the object container because a 
set of remote object containers predefined or prese- 
lected containers which already have received common 
differential updates. This permits purging off of certain 
updates because they are already stored in the prede- 
fined set of object containers. 
[0046] As illustrated at 82, a log structure is shown 
which contains the history of updates to the object con- 
tainer 22. As can be seen, the log contains a field 84 
having a reference to either the object version vector 86 
or to the update stamps 88, and a field 90 having a ref- 
erence to the corresponding object 92 or differential 
update 94. It will be appreciated that object 96 associ- 
ated with update 94 contains an additional update 98, 
both of which have updated the initial object, here illus- 
trated as base 100. 

[0047] Object 92 is the object which must always be 
sent as a whole object, whereas Object 96 can be in 
terms of differential updates to a base. 
[0048] Referring now to Figure 5, what is shown is a 
system to permit the change of data in the object con- 
tainer during synchronization. As can be seen, an 
object container 1, here illustrated at 112 is provided 
with summarizing version vector VN^ of server 1 as 
illustrated at 1 1 1 Also provided to object container 1 1 0 
are summarizing version vectors OW ( , of servers other 
than server 1, here illustrated at 114. The latest com- 
mon ancestor version vector, CW, of server 1 as illus- 
trated at 116 is also supplied to object container 110. 
Likewise, update log 1 of server 1 as illustrated at 1 18 is 
also applied to object container 110. 
[0049] Object container 1 1 0' illustrates a dynamically 
changing state for object container 110 having the indi- 
cated inputs, 1 1 2'-1 1 8'. The dynamically changing state 
of object container 1 10 occurs as a result of object con- 
tainer 120 having been dynamically changed as illus- 
trated at 120' which sends data to object container 110 
as illustrated by arrow 1 20. It will be appreciated that the 
synchronization process starts by object container 110 
sending summarizing version vector 1 as illustrated at 
arrow 122 to the server associated with object container 
2, here illustrated at 120. Object container 120 is 
changed so that it contains synchronizing information 
supplied by summarizing version vector 1 so that it in 
turn updates object container 110 throughout informa- 
tion sent as illustrated by arrow 124. 
[0050] As will be appreciated, object container 120 
has as inputs the summarizing version vector VV 2 of 
server 2 illustrated at 126, the summarizing version vec- 
tors of 0VV 2 of servers other than server 2 as illustrated 



at 128, the latest version vector of CVV 2 of server 2 as 
illustrated at 130 an update log 2 of server 2 as illus- 
trated at 132. 

[0051] The information sent, as illustrated by arrow 
122, is the summarizing version vector 2, plus identifier 
of objects in object container 1 20 which support differ- 
ential synchronization. 

[0052] The synchronization continues as illustrated by 
arrow 134, with information going from object container 
1 10" to object container 120". The information transmit- 
ted as illustrated by arrow 134 are identifiers of objects 
which supports differential synchronization absent the 
objects of container 1 , but which objects exist in object 
container 2. This can be ascertained by comparing the 
received identifiers of objects from object container 120' 
as illustrated by arrow 124 with the identifiers of objects 
in object container 1 . Note that the summarizing version 
vector is not sent at this point because it has been pre- 
viously sent in the first stage of the synchronization 
process. 

[0053] With the transmission of information as illus- 
trated by arrow 124, it is possible for object container 
120" to be purged off information which is stale. The 
purging off of historically stale information is accom- 
plished by access to the latest common version vector 
CW2', here illustrated at 130'. 
[0054] Thereafter, updates are sent from object con- 
tainer 120" to object container 110" at which point, a 
purging operation can take place. After all of the purging 
operations take place, the synchronization is completed 
such that object container 110"' will contain the most 
updated version of all of the objects in the system. It will 
be appreciated that the objects in object container 120 
can be synchronized and updated in the same manner. 
[0055] Referring now to Figure 6, what is illustrated is 
a system for modifying an object in an object container. 
Here, an object container 140 is supplied with a summa- 
rizing version vector 142 and an update log 144. The 
modification step, as illustrated at 146, requires the 
input of a different summarizing version vector, here 
illustrated at 1 48, along with its associated update log, 
here illustrated at 150. What this diagram illustrates is 
that if an object container 140 is modified, then it is nec- 
essary to modify the corresponding summarizing ver- 
sion vector and its corresponding update log in order to 
record a modification. 

[0056] Referring now to Figure 7, what is illustrated is 
a synchronization protocol. The synchronization proto- 
col involves two synchronizers with one as the synchro- 
nization initiator and another as a synchronization 
responder. The synchronizer initiating the synchroniza- 
tion first sends a pull/push request to the synchroniza- 
tion responding to the synchronization which is ready to 
listen to the synchronization initiator. Thereafter, this 
responding server sends a summarizing version vector 
back. Then a further summarizing version vector is 
transmitted in the opposite direction along with identifi- 
ers of objects which support differential synchroniza- 
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tion. Next, identifiers of objects are sent back in the 
opposite direction coming with the result being the 
sending and update to the server responding to the ini- 
tial synchronization request. 

[0057] The result of the synchronization protocol is 
that the server initiating the synchronization is provided 
with an update corresponding to the latest common 
ancestor version vector, with all differential updates 
purged off. The same is true for the server responding 
to the synchronization request. 
[0058] Referring now to Figure 8, what is described is 
a method for extracting updates to be transmitted. Here, 
a server 160 has an associated summarizing version 
vector 162 which contains the aforementioned fields 72 
and 74. The highlighted field indicates a time stamp 
which is earlier in time than the corresponding value in 
the summarizing version vector of a second server, 
server 164. This is illustrated by highlighted portion 166 
of summarizing version vector 168 associated with 
server 164. 

[0059] The updates are extracted by checking the ver- 
sion vectors or update stamps in the update log 170 of 
server 164. Note these version vectors or update 
stamps are the ones found at 86 in Figure 4. The check- 
ing process is accomplished in one embodiment as fol- 
lows: Any object or differential update with a version 
vector or update stamp that includes at least one time 
stamp greater than the highlighted corresponding time 
stamp in summarizing version vector 1 62 and less than 
or equal to the highlighted corresponding time stamp in 
summarizing version vector 1 68 is extracted. 
[0060] Referring to Figure 9, a method for detecting 
and resolving conflicts is shown in which a server 180 
has a corresponding summarizing version vector . An 
update log 1 84 is coupled to server 1 80 and contains an 
object 186 which contains three differential updates 
188, 190 and 192 on top of its base 194. As to the sec- 
ond server, server 196, this server is provided with a 
summarizing version vector 198 and a corresponding 
update log 200, server 196 contains an object 202 
which is the same as object 1 86, with the exception that 
it has different differential updates indicating a conflict 
with 186. This is shown by comparing the highlighted 
portions 204 and 206 with highlighted portion 188. The 
problem is how to figure out why updates 204 and 206 
are in conflict with update 186. This conflict detection is 
accomplished by comparing the version vectors or 
update stamps of the whole object, not shown in this fig- 
ure, or the differential updates 188, 190, 192 in the 
server 1 80 and the differential updates 204, 206, 208, 
210 in server 196 with the common version vector 212 
of the summarizing version vector 182 and 198. The 
common version vector of the two summarizing version 
vectors reflects a common state for the two servers. If 
there are differential update(s) of objects 1 86 and 202 in 
both the servers 180 and 196 that have a time stamp 
greater than the corresponding time stamp in the com- 
mon version vector 212, then two objects 186 and 202 



are in conflict. 

[0061] After the objects 1 86 and 202 have been found 
to be in conflict, the conflict is resolved or reconciled by 
calling a predetermined reconcile method and passing 

5 the differential updates in conflict to the method as 
shown at 220. It will be noted that the reconcile method 
is usually determined based on a specific application. 
[0062] Referring now to Figure 1 0, one method of a 
purging off update logs is shown. Here, a server 260 

w has a summarizing version vector of itself, illustrated at 
260, and the summarizing version vectors 262 and 260 
from other servers. The latest common version vector 
266 of the server 230 is calculated from all the three 
summarizing version vectors 230, 232, and 234 which 

is represents the common state of all three servers. Hav- 
ing obtained the latest common version vector 236, it is 
possible to purge off the corresponding update log 
entries in the update log 238. This means that log 
entries bracketed by bracket 240 may be deleted or 

20 purged. This is because these updates have an update 
stamp which is older than the latest common version 
vector 236, reflecting the fact that these updates have 
already been incorporated in all the above three serv- 
ers. 

25 [0063] Referring now to Figure 11, it is possible to 
switch between whole object synchronization and differ- 
ential synchronization. Here, whole object synchroniza- 
tion is shown at 150, whereas differential 
synchronization is shown at 1 52. A switch 254 is used to 

30 determine which synchronization process will be used. 
Switch 254 is under the control of logic 256 which oper- 
ates as follows. 

[0064] In one embodiment, whether an object which 
can support differential synchronization will be synchro- 

35 nized by sending it as a whole object or the portion of its 
differential updates is determined by the following logic. 
First, if the object only exists in a first server and is 
absent from a second server, then the whole object 
needs to be sent from the first server to the second 

<to server when the two servers synchronize with each 
other. Secondly, if the object exists in the two servers 
and if at least at one server the state of the base of the 
object is newer than the common state of the object at 
the two servers, then no differential synchronization on 

45 the object can be done between the two servers 
because the object at either of the two servers can not 
be synchronized up-to-date after receiving only differen- 
tial updates from the other server. This requires switch- 
ing to whole object synchronization. Otherwise, 

so differential updates can be realized between the two 
servers, and the system can be switched to differential 
synchronization. 

[0065] Other factors, if needed, can be added to the 
logic for the switch between whole object synchroniza- 
55 tion and differential synchronization. For instance, a 
switch from differential synchronization to whole object 
synchronization may be triggered if the free space of 
memory is close to being exhausted, or if it is found to 
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be more efficient to send a whole object other than its 
differential updates. Alternatively, a switch from whole 
object synchronization to differential synchronization 
can be determined to minimize the data to be transmit- 
ted. In other words, it may be desirable to switch to dif- 5 
ferential synchronization when it is more efficient to 
send an object's differential updates as opposed to the 
whole object. Note that the switch between whole object 
synchronization and differential synchronization is 
determined on the basis of individual objects. 10 
[0066] More particularly, the data in the subject sys- 
tem can be files, documents, databases or the mix of 
part or all of these and they will be appropriately ground. 
The way data are grouped can change from application 
tP application. For example, in file systems, a directory 15 
can be considered as a file group and it may further 
define structures inside to organize data (files). In object 
oriented databases, a data group possibly includes all 
the homogeneous or heterogeneous objects needed to 
meaningfully support its applications. As the first step, 20 
we assume that all the data groups in different devices, 
which need to be synchronized with each other, eventu- 
ally contain exactly the same data set. 
[0067] The subject system has been implemented 
using Java. Therefore, all descriptions below are based 25 
on Java's terminology. An object container is a container 
which stores and maintains a Java object group. 
Objects in the object container are represented by Syn- 
chronizable objects which are any Java Serializable 
objects providing a replace/apply method and a 30 
reconcile method. Synchronizable objects are the 
abstraction of concrete data such as files, documents, 
or rows of database tables. Assuming that a full set of 
(Synchronizable) objects already exist in an object con- 
tainer, applications can then access objects in the 35 
object container by calling its get or put methods. 
[0068] In a server, multiple object containers can be 
created by calling the object container manager's 
openObjectContainer method. In each server, only one 
object container manager can exist and is used to man- 40 
age all the object containers at the server. In addition to 
creating object containers, it also offers services such 
as listing or deleting the created object containers. For 
convenience, each object container will be assigned a 
globally unique identifier when created which can be 45 
used when referencing that object container in the sub- 
ject system. In one embodiment, the object container 
identifiers can be assigned to the hash codes which are 
generated by a one-way hash function with the URLs of 
the object containers as the input to it. 50 
[0069] Synchronizers are the key component in the 
subject system in order to bring any two object contain- 
ers into a consistent state. There can be multiple syn- 
chronizers in each server, where each of the 
synchronizers uses a different type of communications 55 
transport and protocol. For example, synchronizer 1 
may use TCP/IP and a two-way synchronization proto- 
col while synchronizer 2 may use infrared and a one- 



way protocol which is more suited for unreliable wireless 
communications. A synchronizer is dynamically created 
by the synchronizer manager when needed and will be 
self-destroyed just after it has finished the synchroniza- 
tion. The synchronizer manager in a server can either 
bind an object container which is going to be synchro- 
nized with a synchronizer and trig the synchronizer to 
start the synchronization, or listen to one or more syn- 
chronization requests from the synchronizer managers 
at remote servers. The protocol utility can help the syn- 
chronizer manager adaptively select a synchronizer to 
be generated that matches the best connection availa- 
ble currently. The protocol utility is one of the classes 
provided by the Java extension package. 
[0070] As can be seen from the above description, the 
object semantics, such the format it uses to represent 
its contents and the algorithms it uses to reconcile con- 
current update conflicts, is separated from the synchro- 
nizer. This feature leads to the ability that the subject 
system may synchronize between different application 
systems. 

[0071] Assume server 1 is specified to synchronize 
the object container 1 at it with the object container 2 at 
the server 2. Further, assume all the components bun- 
dled to object container i (i = 1 ,2) are annotated as syn- 
chronizer manager i, synchronizer i, etc. The detailed 
steps for the whole synchronization process are as fol- 
lows: 

1) Synchronizer manager 1 opens a communica- 
tion channel to the synchronizer manager 2 which 
has been listening to synchronization requests. 

2) Synchronizer manager 1 sends the synchroniza- 
tion request to synchronizer manager 2 which indi- 
cates the object container to be synchronized with 
and the synchronizer to be used in the synchroniza- 
tion. 

3) Synchronizer manager 1 and 2 create and con- 
nect the pair of matching synchronizers 1 and 2 
with synchronizer 1 as the synchronization initiator 
and synchronizer 2 as the listener. 

4) Synchronizer 1 sends a pull or push update com- 
mand to synchronizer 2. The pull update command 
means that object container 1 requires the receipt 
of updates from object container 2, whereas the 
push update command means that object container 
1 will propagate updates from it to object container 
2. Whether synchronizer 1 should send the pull or 
push update command or both in a serialized man- 
ner is determined by the system utility or application 
which initiated the synchronization. The following 
steps assume that synchronizer 1 sent a push 
update command to synchronizer 2. When syn- 
chronizer 1 has sent a pull update command to syn- 
chronizer 2, synchronizer 2 sends a confirmation 
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message back to synchronizer 1 and the rest of the 
steps will be the mirror image of the foregoing 
steps. 

5) Synchronizer 1 obtains the summarize version 5 
vector of object container 1 by calling the getSum- 
marizingversionVector method as shown in Figure 

3 on object container 1 , and sends the summarizing 
version vector to synchronizer 2. The summarizing 
version vector of object container 1 reflects the cur- 10 
rent state of object container 1 . 

6) Synchronizer 2, when needed, records the 
received summarizing version vector of object con- 
tainer 1 in object container 2. Synchronizer 2 then is 
obtains the summarizing version vector of object 
container 2 and the identifiers of objects in object 
container 2 that support differential synchronization 

by calling the getSummarizingVersionvector 
method which is shown in Figure 3 and by calling 20 
the entries method described in the source code 
presented hereinafter on object container 2. The 
summarizing version vector of object container 2 
reflects the current state of object container 2. Syn- 
chronizer 2 then sends the summarizing version 25 
vector and identifiers obtained above to synchro- 
nizer 1 . 

7) Synchronizer 1, when needed, records the 
received summarizing version vector of object con- 30 
tainer 2 in object container 1 . Synchronizer 1 then 
obtains the identifiers of the objects which have to 

be received as a whole object by calling the get- 
WholeObjectlds method described in the source 
code presented hereinafter on object container 1 35 
with the received summarizing version vector of 
object container 2 and the received identifiers as 
the arguments. The identifiers obtained above 
include the identifiers of objects which exist in 
object container 2 but miss from object container 1 40 
and the objects the state of the base of which are 
newer than the common state of object container 1 
and object container 2. Synchronizer 1 then sends 
the identifiers obtained above to synchronizer 2. 

45 

8) Synchronizer 2 obtains the updates which have 
to be sent to object container 1 by calling the gener- 
ateUpdates method as shown in Figure 3 on object 
container 2 with the received identifiers of objects 
and the summarizing version vector of object con- so 
tainer 1 received in step 6 as arguments. Each of 

the updates obtained above may be a whole object 
with its version vector or a differential update with 
its update stamp. Synchronizer 2 then sends the 
updates obtained above to synchronizer 1 one by 55 
one. 

9) Synchronizer 1 applies each of the received 



updates to object container 1 by calling the apply- 
Update method as shown in Figure 3 on object con- 
tainer 1 . 

1 0) After finishing the sending or receiving updates, 
both synchronizers 1 and 2 may call the purgeOff- 
History method as shown in Figure 3 on object con- 
tainer 1 and object container 2, respectively, in 
order to purge off the differential updates which 
have been contained in all the object containers 
object container 1 or object container 2 is con- 
cerned with. 

[0072] The applyUpdate method of an object con- 
tainer will further call one of the methods defined in an 
object in order to replace an object with the received 
object, reconcile between an object with the received 
object, add the received differential update to an object, 
or reconcile the received differential update with the dif- 
ferential update(s) of an object which are in conflict with 
the received differential update. It will also update the 
summarizing version vector of the object container to 
reflect the fact that the object container has already 
placed the update into it. 

[0073] As can be seen from the description above. 
The summarizing version vector deployed in the subject 
system is not only used for the detection of concurrent 
update conflicts, but also for reducing the number of 
entries in an update log. 

[0074] The summarizing version vector of an object 
container is in effect the summary of the version vectors 
of all the objects and the update stamps of all the differ- 
ential updates in the object container. For an object 
which cannot support differential synchronization, the 
version vector of the object reflects the current state of 
the whole object, whereas for an object which can sup- 
port differential synchronization, the version vector of 
the object reflects the current state of the base of the 
object, absent all the differential updates associated 
with it. Much like the version vector for a whole object, 
the version vector for the base of an object also dynam- 
ically changes as part or all of the differential updates 
associated with it are purged off. Therefore, the version 
vector of an object characterizes the current state of the 
whole object or the base of the object. 
[0075] The summarizing version vector of an object 
container or the version vector of an object is the vector 
of update stamps. An update stamp in turn consists of 
two fields: the first field is the identifier of an object con- 
tainer which has made modification(s) to an object if this 
is the version vector of the object or to object(s) in an 
object container if this is the summarizing version vector 
of the object container; the second field is the time 
stamp generated by the modifying object container cor- 
responding to the first field when it made the last modi- 
fication to the object or objects. For convenience, an 
update stamp is represented by (sidj, t|) where sidj is the 
first field of the update stamp, and t] is the second field 
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of the update stamp. The time stamp in an update 
stamp can be real time, an update sequence number, or 
whatever is meaningful to the application. A series of 
time stamps generated by an object container must be 
consistent in meaning and strictly monotonic while dif- 5 
ferent object containers may have different semantics 
with the time stamps they generate. This is very impor- 
tant when synchronizing between object containers 
across different application systems because it is 
impractical, to require different application systems gen- 10 
erating time stamps with the same semantics. 
[0076] Using the update stamp representation above, 
the summarizing version vector of an object container or 
the version vector of an object can be represented as 
{(sid,, t,), • • • , (sidi, tj), • • • ,(sid n , t n )} or {(sid h tj) I i = is 

1 , 2 n}. The semantics of the summarizing version 

vector of an object container and the version vector of 
an object are similar, except that the version vector of an 
object only reflects that object's current state in its con- 
taining object container other that the current state of 20 
the containing object container. Due to the similarity, 
only the summarizing version vector will be described in 
detail in the following. The version vectors of objects will 
be mentioned only when needed. 

[0077] The i-th component of the summarizing version 25 
vector of an object container means that the object con- 
tainer has contained all modifications made by the 
object container with the identifier sid, up until the time 
indicated by the time stamp tj. For convenience, the 
summarizing version vector of object container j will be 30 
represented as w', and its i-th update stamp as Wji. 
[0078] The size n of the summarizing version vector 
w' of object container j means that object container j-th 
has contained all the modifications to the objects in it 
made by the object containers with the identifiers sid,, 35 
sid 2 , sid n up until the time stamps t 1f t 2 , ... t n respec- 
tively. Based on the descriptions above, several func- 
tions/propositions are defined/proved below in order to 
make it possible to compare the state of two entities by 
utilizing summarizing version vectors, version vectors, 40 
and update stamps. An entity hereinafter represents an 
object container, an object, or a differential update. 
[0079] Definition: z = y.getTime(x) is a function 
which returns the time stamp z in y that corresponds to 
the object container with the identifier x. y represents 45 
the summarizing version vector of an object container, 
the version vector of an object or the update stamp of a 
differential update. If x does not appear in y, then the 
returned time stamp z is 0. 

[0080] Definition : Let y, and y 2 are either the sum- 50 
marizing version vector of an object container, the ver- 
sion vector of an object or the update stamp of a 
differential update, y, is equal to Y 2 if 
y 1 .getTimc(x) = y 2 .getTime(x) for any object con- 
tainer's identifier in y, and Y 2 . y, is newer than y 2 if y, is ss 
not equal to y 2 and y , .getTime(x) >=y 2 .getTime(x) 
for any object container's identifier in y, and y 2 .yi is 
older than y 2 if y, is not equal to y 2 and 



y , .getTime(x) <= y 2 .getTime(x) for any object con- 
tainer's identifier in y, and y 2 . y, conflicts with y 2 if y, is 
neither equal to y 2 , nor newer than y 2 , nor older than y 2 . 
[0081] Proposition: Let e, and e 2 represent either an 
object container, an object, the base of an object, or a 
differential update and y, and y 2 are the summarizing 
version vector, the version vector, or the update stamp 
of e<! and e 2 , respectively, e, has the current state same 
as or consist with e 2 's if y, is equal to y 2 . e, has the cur- 
rent state newer than e 2 's if y, is newer than y 2 . e, has 
the current state older than e 2 's if y, is older than y 2 . e, 
has the current state conflicting with e 2 's if y, is conflict- 
ing with y 2 . 

[0082] Prove : By definition, if y, is equal to y 2 , then 
any update contained in e, is also contained in e 2 and 
vice versa. So obviously e, has the current state same 
as e 2 if y, is equal to y 2 . Next assume that y, is newer 
than y 2 and the current state of e, is not newer than e 2 s. 
By definition, if y, is newer than y 2 , then e, must contain 
updates which are not contained in e 2 and any update 
contained in e 2 must have been propagated into e v So 
the assumption does not hold. The other part of the 
proposition can be proved similarly. (Prove ends.) 
[0083] Definition: Let y, and y 2 be the summarizing 
version vectors of object containers e, and e 2 , respec- 
tively. r y is called the common version vector of y, and 
y 2 if and only if r y contains only the identifiers of object 
containers that exist in both y, and y 2 and the time 
stamp in r y corresponding to each of such identifiers is 
the smaller of the corresponding time stamps in y, and 

Y2- 

[0084] Proposition : Let e, and e 2 be two different 
object containers and y , and y 2 be the summarizing ver- 
sion vectors of y, and y 2 , respectively. Further, let r y be 
the common version vector of y, and y 2 . r y then repre- 
sents the state from which e, and e 2 diverged. 
[0085] Prove : By definition of y, and y 2 , if an update 
has been contained in both e, and e 2 , then it must be 
reflected in both y, and y 2 , and therefore the time stamp 
associated with the update must be equal or less than 
the corresponding time stamps in y, and y 2 . In other 
words, the time stamp associated with the update is 
equal to or less than to the smaller of the two corre- 
sponding time stamps in y, and y 2 . By definition of the 
common version vector, the update is reflected in r y 
[0086] If an update is only contained in either e, or e 2 , 
then the time stamp associated with the update must be 
greater than the corresponding time stamp in the sum- 
marizing version vector of the other object container. 
Therefore, the update can not have been reflected by 
the summarizing version vector of the object container 
absent it. By the definition of the common version vec- 
tor, the update also can not have been reflected by r y 
(Prove ends.) 

[0087] It is appreciated that whenever a modification 
to an object is made to an object container, that modifi- 
cation will be reflected into the object container by call- 
ing the put method as shown in Figure 3 on the object 
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container. The put method will in turn cause the object 
container to update its summarizing version vector, 
update the version vector of the object to be modified or 
generate an update stamp associated with the differen- 
tial update to be added, and add the reference to the 
modified object or added differential update and the ref- 
erence to the corresponding version vector or update 
stamp to the update log of the object container. The 
update stamp of a differential update means that the dif- 
ferential update was generated by the object container 
with the identifier shown in the first field of the update 
stamp at the time shown in the second time stamp in the 
update stamp. 

[0088] Based on the description above, it is realizable 
to figure out the difference of an object container from 
another, detect concurrent update conflicts between two 
object containers, and reduce the number of entries in 
update logs. Let l s denote object container i that is going 
to pull updates from another object container j denoted 
as R s . The concrete algorithms for the computation 
mentioned above are as follows. 
[0089] Difference Computing: - Assume R s which 
has the summarizing version vector sw' has received 
the summarizing version vector sw 1 of l s and the identi- 
fiers E 1 of the objects which need to be sent to l s as 
whole objects. Then the common version vector rw'' of 
sw' and sw* can be calculated based on the definition 
above and all the updates need to be sent to l s are 
determined as follows. 

[0090] First for any object which does not support dif- 
ferential synchronization, the version vector of the 
object is checked. The whole object will be sent as an 
update to l s if its version vector is newer than rw 1 '. Let 
U1 be the group of updates obtained by the above com- 
puting. 

[0091] Secondly, for any object which supports differ- 
ential synchronization, the version vector of the base of 
the object is checked. The whole object will be sent as 
an update to l s if its base version vector is newer than 
rw''. Let U2 be the group of updates obtained by the 
above computing. 

[0092] Next, for any object which supports differential 
synchronization, if there are differential updates applied 
to it, then the update stamp of each of the differential 
updates is checked. The whole object will be sent as an 
update to l s if the update stamp of any such differential 
update of the object is newer than or conflicting with rw 1 ' 
and the identifier of the object is contained in E' or any 
differential update of the object will be sent as an update 
to l s if the update stamp of the differential update is 
newer than or conflicting with rw 1 '. Let U3 be the group 
of updates obtained by the above computing. 
[0093] Finally, let U be the union of U1, U2 and U3. 
The order of all updates in U is so that if an update 
whose version vector or update stamp is newer than 
another than the version vector or update stamp of 
another update then the first update is before the sec- 
ond update. If two updates in U have conflicting version 



vectors or update stamps, then if the two updates are 
differential updates the order between the two updates 
must be consistent with the order of them in the update 
log of R s otherwise if let v1 and v2 be the version vec- 

5 tors or update stamps of the two updates, respectively, 
the two updates will be ordered as follows. 
[0094] Sequentially pick up an identifier sli of object 
container in y1 from the beginning toward the end and 
examine the conditions described in the following until 

10 having found an sli which meets one of the condi- 
tions. If there exists a sli such that 
v1 .getTime(sli)> v2.getTime(sli) then put the update 
with v1 before the update with v2. If there exists a sli 
such that v1.getTime(sli)< v2.getTime(sli) then put the 

is update with v1 after the update with v2. One of the con- 
dition will be finally met because no two version vectors 
in the same object container will be the same because 
of the monotonic feature of the time stamps generated 
by the same object container. 

20 [0095] U is then the set of updates representing the 
difference of l s from R s which will be sent to l s . 
[0096] Concurrent Conflict Detection: - The detec- 
tion of concurrent update conflicts between two objects, 
belonging to l s and R s , respectively, is done after l s 

25 received an update related to the two objects. 

[0097] If the received update is a whole object along 
with its version vector, then the version vector of the 
received update is compared with the version vector of 
the object in l s . If the received object has a version vec- 

30 tor newer or older than the version vector of the object in 
l s , then there will be no conflict between the two objects 
and the received object will either replace the object in 
l s or be thrown away. If the version vector of the two 
objects conflicts, then the two objects are in conflicts 

35 and the predefined reconcile method will be called on 
the object in the l s and R s . 

[0098] If the received update is differential update with 
its update stamp, then the differential updates associ- 
ated with the corresponding object in l s will be com- 

w pared with the common version vector rw 1 ' of the 
summarizing version vectors of I and R. If no differential 
update of the object in l s is newer than or conflicts with 
iw'', then there is no conflict between the object in l s 
and the corresponding object in R s . Otherwise, all differ- 

45 ential updates of the object in l s that are newer than or 
conflicting with rw'' conflict with the received update 
and they will be all passed to the reconcile method of 
the object in l s to resolve the conflict. 
[0099] Update Log Reduction: - For each of Is and 

so Rs, called the first object container, if the other object 
container, called the second object container, is among 
the first object container's selected set of object contain- 
ers that are considered in the process of the update log 
reduction of the first object container, then the summa- 

55 rizing version vector of the summarizing object con- 
tainer will be recorded in the first object container and 
the following steps for the update log reduction will be 
executed. First, the summarizing version vectors of 



13 



25 



EP 0 974 895 A2 



26 



other object containers that were recorded in the first 
object container are checked. If the summarizing ver- 
sion vectors of all the object containers selected by the 
first object container has been recorded, the latest com- 
mon version vector of the first object container will be 5 
computed or re-computed. The latest common version 
vector of the first object container computed here con- 
tains only the object container identifiers which are 
included in each of the summarizing version vectors of 
the other object containers recorded in the first object J0 
container and the summarizing version vector of the first 
object container itself. Each time stamp in the latest 
common version vector of the first server is the smallest 
of the corresponding time stamps in the summarizing 
version vector of the other object containers recorded in 15 
the first object container and the summarizing version 
vector of the first object container itself. Next, if the lat- 
est common version vector of the first object container is 
updated, all the log entry in the update log of the first 
object container are scanned, and the updates, that is, 20 
the deletion of whole objects and the records of differen- 
tial updates whose version vector or update stamps 
whose version vectors or update stamps are older than 
or equal to the latest common version vector of the first 
object container, are purged off from the update log of 25 
the first object container. As can be seen from its defini- 
tion above, the latest common version vector of the first 
object container represents the latest state of the object 
container which are common to all the other object con- 
tainers selected by the first object container and the first 30 
object container itself. Therefore, the semantics of the 
purging off of the updates in the update log of the first 
object container is that any update which has propa- 
gated in all the other object containers selected by the 
first object container is eventually removed from the first 35 
object container. This, in general, can free some of the 
memory and disk space used by the first object con- 
tainer. 

[0100] A program listing in C describing the above 
system is contained in the Appendix. 40 
[0101] Having now described a few embodiments of 
the invention, it should be apparent to those skilled in 
the art that the foregoing is merely illustrative and not 
limiting, having been presented by way of example only. 
Numerous modifications and other embodiments are 45 
within the scope of one of ordinary skill in the art and 
are contemplated as failing within the scope of the 
invention as defined by the appended claims equivalent 
thereto. 

50 

Claims 

1. A universal system for synchronizing objects at 
servers which accommodates wide area mobile 
computing, comprising: 55 

a network of primary servers; 

high performance reliable links connecting said 



primary servers; 

means for automatically synchronizing data at 

said primary servers; 

a number of secondary servers; 

means for coupling a first secondary server to 

one of said primary servers by a link; and, 

means at a secondary server for synchronizing 

objects thereat with objects at any primary 

server to which it can be connected, upon 

establishment of a link there between. 

2. The system of Claim 1 , and further including a sec- 
ond secondary server and a synchronization link 
between said first and second secondary servers, 
whereby said system supports client/server and 
peer-to-peer synchronization modes. 

3. A universal system for synchronizing objects at 
servers which accommodates wide area mobile 
computing, comprising: 

a network of primary servers; 

high performance reliable links connecting said 

primary servers; 

means for automatically synchronizing objects 
at said primary servers; 
number of secondary servers; 
means for coupling a secondary server to one 
of said primary servers by a link; and, 
means at a secondary server and initiated 
thereby for synchronizing objects thereat with 
objects at any primary server to which it can be 
connected, whereby if said link between said 
secondary server and said primary server is 
less reliable, synchronization may nonetheless 
take place when said secondary server deter- 
mines that a reliable link can be established 
between itself and the primary server to which 
it can be connected. 

4. The system of Claim 3, wherein said means for syn- 
chronizing includes means at said secondary sever 
for ascertaining the reliability of said link and for ini- 
tiating synchronization only when a sufficiently reli- 
able link can be established. 

5. The system of Claim 4, wherein said ascertaining 
means includes means lor determining the band- 
width for the transfer of information over said link. 

6. The system of Claim 4 wherein said ascertaining 
means include means for ascertaining the available 
resources at said secondary server. 

7. The system of Claim 3 wherein said synchronizing 
means include means at said second server for 
manually initiating said synchronization. 
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8. A system for synchronizing objects at two servers, 
comprising: 

servers at two separate locations; 
a network linking said servers; 
means at one of said servers for synchronizing 
objects at said servers either on a whole object 
basis or a differential basis; and, 
means for causing said synchronizing means 
to switch between whole object synchroniza- 
tion and differential synchronization. 

9. The system of Claim 8 wherein said means for 
causing said synchronizing means to switch 
includes means for ascertaining the available 
resources at one of said servers and for switching 
to whole object synchronization when resources at 
a server cannot support differential synchroniza- 
tion. 

10. A method for synchronizing objects at two servers 
so as to minimize the amount of data transmitted by 
avoiding the necessity for exchanging version vec- 
tors of individual objects, comprising the steps of: 

providing an object container at each of the 
servers, each of the object containers adapted 
to contain an object; 

providing a summarizing version vector for 
each object container that summarizes the 
state of each of the objects at a server, the 
summarizing version vector summarizing the 
state of the object container and having update 
stamps each having a field for the identifier 
associated with the object container and a field 
for a time stamp which corresponds to the last 
time when the object container created, modi- 
fied or deleted any object thereat, the time 
stamp being generated by the object container 
when an object therein is created, modified, or 
deleted; and, 

initiating synchronizing by transmitting only a 
single summarizing version vector from a first 
server to a second server, and returning 
updates all at once from the second server to 
the first server if the associated version vectors 
of objects at the second server are newer than 
or conflict with the received summarizing ver- 
sion vector from the first server, or if the 
updated time stamps of the individual objects 
at the second server are newer than or conflict 
with the time stamps associated with the sum- 
marizing version vector from the first server. 

11. The method of Claim 10, wherein said synchroniz- 
ing step includes performing differential synchroni- 
zation based on the summarizing version vectors 
and the update stamps to create differential 



updates, such that the differential synchronization 
is accomplished by transmitting only those parts of 
objects which have changed as indicated by the dif- 
ferential updates. 

5 

12. The method of Claim 11, and further including the 
steps of generating a latest common version vector 
and using the latest common version vector to 
purge off selected differential updates. 

10 

13. The method of Claim 10 and further including the 
step of restarting a synchronization from the point 
of a previous failure with data from an unaffected 
server, with the point of failure and the existence of 

is an unaffected server being determined by the sum- 
marizing version vectors associated with the serv- 
ers. 

14. The method of Claim 10, and further including the 
20 steps of updating both the version vector of an 

object corresponding to an update from a first 
server to a second server, and the summarizing 
version vector in the second server immediately 
after the second server receives an update from the 

25 first server, such that synchronization can be 
restored without resending updates which are 
already received by the second server by compar- 
ing the summarizing version vector of the first 
server with the updated version vector of the sec- 

30 ond server, thus to provide fine grain synchroniza- 
tion. 

15. A universal synchronization system in which 
objects which exist at two servers are synchronized 

35 regardless of object type, comprising: 

servers at separate locations, each having a 
version of an object and an update vector asso- 
ciated therewith; 

40 a network linking said servers; 

means at each server for separating the 
semantics of the objects from the synchroniza- 
tion including means for extracting object 
updates in a standard format regardless of 

45 object format and means for synchronizing the 

objects at said servers based on a standard 
protocol utilizing said standard format object 
updates, whereby synchronization can take 
place regardless of the form of the object to 

so permit synchronization between different sys- 

tems. 

16. A method for propagating updates from a first 
server to a second server and detecting and resolv- 

55 ing conflicts in the synchronization of objects at the 
two servers, comprising the steps of: 

sending a summarizing version vector from the 
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first server to the second server; 
upon receiving the summarizing version vector 
of the first server, sending the summarizing 
version vector of the second server back to the 
first server followed by all of the identifiers of 5 
objects which exist in the second server and 
that can support differential synchronization; 
upon receiving the summarizing version vector 
and identifiers from the second server, deter- 
mining at the first server the identifiers from the 10 
received identifiers corresponding to which the 
objects do not exist in the first server; 
calculating at the first server the common ver- 
sion vector of the summarizing version vectors 
of the two servers; is 
determining at the first server all of the identifi- 
ers of objects that can support differential syn- 
chronization and which version vectors to the 
bases, absent all of the corresponding differen- 
tial updates, are newer than the calculated 20 
common version vector, the first and second 
determined identifiers of objects being those 
objects which can not realize differential syn- 
chronization in the undergoing synchroniza- 
tion, and which therefore must be switched to 25 
whole object synchronization; 
sending the identifiers of objects which must 
switch to whole object synchronization to the 
second server; 

upon receiving the identifiers from the first 30 
server, calculating at the second server the 
common version vector of the summarizing 
version vectors of the two servers; 
determining at the second server the identifiers 
of objects different from the received identifiers 35 
that can support differential synchronization 
and which version vectors to the bases, absent 
all of the corresponding differential updates, 
are newer than the calculated common version 
vector, the determined objects being those 40 
objects which can not realize differential syn- 
chronization in the undergoing synchroniza- 
tion, and therefore must be switched to whole 
object synchronization; 

adding the received identifiers to the deter- 45 
mined identifiers as the set of identifiers of 
objects which, if sent from the second server to 
the first server, must send as whole objects; 
comparing at the second server the version 
vectors of individual objects, including those so 
that can not support differential synchroniza- 
tion and those that can support differential syn- 
chronization and which identifiers are in the set 
of identifiers determined by the second server, 
with the summarizing version vector of the first 55 
server; 

extracting the objects which have a version 
vector newer than or conflicting with the sum- 



marizing version vector of the first server as the 
updates that are the whole objects; 
comparing at the second server the update 
stamps of individual differential updates that 
were applied to the objects at the second 
server which have an identifier not in the set of 
identifiers determined by the second server; 
extracting the differential updates which have 
an update stamp newer than or conflicting with 
the summarizing version vector of the first 
server; 

sending all of the extracted updates along with 
their corresponding object identifier and ver- 
sion vectors or update stamps from the second 
server to the first server in a consistent order; 
upon finishing sending the updates to the first 
server, purging off the latest common ancestor 
version vector of the second server and some 
or all of the differential updates at the second 
server; 

upon receiving an update along with the corre- 
sponding object identifiers and version vector 
or update stamp from the second server, deter- 
mining whether the received update is a whole 
object or differential update; 
comparing the received version vector with the 
version vector of the corresponding object at 
the first server if the received update is a whole 
object and throwing away the received object if 
the received version vector is older than or 
equal to the version vector of the object at the 
first server; 

replacing the object at the first server with the 
received object if the received version vector is 
newer than the version vector of the object at 
the first server, or identifying that there is a con- 
flict between the object at the first server and 
the received object and having the object at the 
first server resolve the conflict if the received 
version vector conflicts with the version vector 
of the object at the first server; 
comparing the received update stamp with the 
summarizing version vector of the first server if 
the received update is a differential update; 
throwing away the received differential update 
if the received update stamp is older than or 
equal to the summarizing version vector of the 
first server; 

otherwise comparing the summarizing version 
vector of the second server with the update 
stamp of each differential update at the first 
server which was applied to the corresponding 
object at the first server; 
extracting all the differential updates which 
have an update stamp newer than or conflicting 
with the summarizing version vector of the sec- 
ond server; 

applying the received differential update to the 
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object at the first server if there is no differential 
update extracted at the first server, or identify- 
ing that the received differential update con- 
flicts with the differential updates extracted at 
the first server; 5 
having the object at the first server resolve the 
conflict; 

upon finishing receiving the updates from the 
second server, updating the latest common 
ancestor version vector of the first server; and, io 
purging off some or all of the differential 
updates at the first server. 
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